Rapid and accurate taxonomic classification of insect (class Insecta) cytochrome c oxidase subunit 1 (COI) DNA barcode sequences using a naïve Bayesian classifier

Authors

  • TERESITA M. PORTER
  • JOEL F. GIBSON
  • SHADI SHOKRALLA
  • DONALD J. BAIRD
  • G. BRIAN GOLDING
Abstract

Current methods to identify unknown insect (class Insecta) cytochrome c oxidase subunit 1 (COI barcode) sequences often rely on distance thresholds that can be difficult to define, sequence similarity cut-offs, or monophyly. Some of the most commonly used metagenomic classification methods do not provide a measure of confidence for their taxonomic assignments. The aim of this study was to use a naïve Bayesian classifier (Wang et al. Applied and Environmental Microbiology, 2007; 73: 5261) to automate taxonomic assignments for large batches of insect COI sequences, such as data obtained from high-throughput environmental sequencing. This method provides rank-flexible taxonomic assignments with an associated bootstrap support value, and it is faster than the BLAST-based methods commonly used in environmental sequence surveys. We developed three training sets and rigorously tested their performance using leave-one-out cross-validation, two field data sets, and targeted testing of Lepidoptera, Diptera and Mantodea sequences obtained from the Barcode of Life Data system. We found that type I error rates (incorrect taxonomic assignments with high bootstrap support) were already relatively low but could be lowered further by ensuring that all query taxa are actually present in the reference database. Choosing bootstrap support cut-offs according to query length and summarizing taxonomic assignments to more inclusive ranks can also help to reduce error while retaining the maximum number of assignments. Additionally, we highlight gaps in the taxonomic and geographic representation of insects in public sequence databases that will require further work by taxonomists to improve the quality of assignments generated using any method.
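The idea of choosing a bootstrap cut-off by query length and then summarizing an assignment to a more inclusive rank can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the cut-off table, and the example lineage with its support values are hypothetical and are not taken from the study or from the RDP classifier's output format.

```python
from typing import List, Tuple

# (rank, taxon name, bootstrap support in [0, 1]), ordered from the most
# inclusive rank to the most specific rank.
Lineage = List[Tuple[str, str, float]]

# Hypothetical cut-offs keyed by minimum query length in base pairs.
# Illustrative placeholders only, not values reported by the study.
ILLUSTRATIVE_CUTOFFS = [(400, 0.60), (200, 0.70), (0, 0.80)]


def cutoff_for_length(query_length: int) -> float:
    """Return the bootstrap cut-off for a query of the given length."""
    for min_length, cutoff in ILLUSTRATIVE_CUTOFFS:
        if query_length >= min_length:
            return cutoff
    raise ValueError("query_length must be non-negative")


def summarize_assignment(lineage: Lineage, query_length: int) -> Lineage:
    """Keep ranks from the top down until support drops below the cut-off."""
    cutoff = cutoff_for_length(query_length)
    kept = []
    for rank, name, support in lineage:
        if support < cutoff:
            break  # every more specific rank is dropped as well
        kept.append((rank, name, support))
    return kept


# Made-up classifier output for one query sequence.
example = [
    ("class", "Insecta", 1.00),
    ("order", "Lepidoptera", 0.95),
    ("family", "Noctuidae", 0.75),
    ("genus", "Agrotis", 0.40),
]

if __name__ == "__main__":
    for rank, name, support in summarize_assignment(example, query_length=250):
        print(f"{rank}\t{name}\t{support:.2f}")
```

With these placeholder values, a shorter query is held to a stricter cut-off, so its assignment is reported at a more inclusive rank than the same lineage would be for a longer query.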

Similar resources

Rapid and accurate taxonomic classification of insect (class Insecta) cytochrome c oxidase subunit 1 (COI) DNA barcode sequences using a naïve Bayesian classifier

Supplementary Methods. Customizing the training sets. The format for the training files is described in the original RDP classifier version 2.5 sample data folder that comes with the distribution available from http://sourceforge.net/projects/rdp-classifier/ (Wang et al. 2007). For each of our training sets, the two files we used to train the classifier are provided so that they can ...

Study of genetic diversity of Dussumieria acuta (Valenciennes, 1847) in Persian Gulf and Oman sea (Coast of the Hormozgan Province) using Cytochrome oxidase subunit I gene (COI)

In this study, 12 specimens were collected from Bandar Jask, Qeshm Island and Bandare Lengeh in Hormozgan Province. DNA extraction was performed using the phenol-chloroform method. A partial DNA sequence of the cytochrome oxidase subunit I gene (COI) was used to evaluate genetic diversity. The cytochrome oxidase subunit I gene was sequenced using specific primers designed based on sequences regis...

A DNA barcode library for ground beetles (Insecta, Coleoptera, Carabidae) of Germany: The genus Bembidion Latreille, 1802 and allied taxa

As a molecular identification method, DNA barcoding based on partial cytochrome c oxidase subunit 1 (COI) sequences has proven to be a useful tool for species determination in many insect taxa, including ground beetles. In this study we tested the effectiveness of DNA barcodes to discriminate species of the ground beetle genus Bembidion and some closely related taxa of Germany. DNA barcodes w...

DNA Barcoding of Fish, Insects, and Shellfish in Korea

DNA barcoding has been widely used in species identification and biodiversity research. A short fragment of the mitochondrial cytochrome c oxidase subunit I (COI) sequence serves as a DNA bio-barcode. We collected DNA barcodes, based on COI sequences from 156 species (529 sequences) of fish, insects, and shellfish. We present results on phylogenetic relationships to assess biodiversity the in t...

DNA barcoding facilitates description of unknown faunas: a case study on Trichoptera in the headwaters of the Tigris River, Iraq

Monitoring water quality with aquatic insects as sentinels requires taxonomic knowledge of adult and immature life stages that is not available in many parts of the world. We used deoxyribonucleic acid (DNA) barcoding to expedite identification of larval caddisflies from 20 sites in the headwaters of the Tigris River in northern Iraq by comparing their mitochondrial cytochrome c oxidase subunit...

Publication date: 2014